6 research outputs found

    Programming model abstractions for optimizing I/O intensive applications

    This thesis contributes, from the perspective of task-based programming models, to the effort of optimizing I/O intensive applications. Throughout this thesis, we propose programming-model abstractions and mechanisms that target a twofold objective: on the one hand, to improve the I/O and total performance of applications on today's complex storage infrastructures; on the other hand, to achieve this performance improvement without increasing the complexity of application programming. The following paragraphs briefly summarize each of our contributions.

First, towards exploiting the compute-I/O patterns of I/O intensive applications and transparently improving I/O and total performance, we propose a number of abstractions that we refer to as I/O Awareness abstractions. An I/O aware task-based programming model is able to separate the handling of I/O and computation by supporting I/O Tasks, whose execution can overlap with the execution of compute tasks (see the sketch following this abstract). Moreover, we provide programming-model support to improve I/O performance by addressing the issue of I/O congestion. This is achieved by using Storage Bandwidth Constraints to control the level of task parallelism. We support two types of such constraints: (i) static storage bandwidth constraints that are manually set by application developers, and (ii) auto-tunable constraints that are automatically set and tuned throughout the execution of the application.

Second, in order to exploit the heterogeneity of modern storage systems and improve performance in a transparent manner, we propose a set of capabilities that we refer to as Storage Heterogeneity Awareness. A storage-heterogeneity aware task-based programming model builds on the concepts and abstractions introduced in the first contribution to improve the I/O performance of applications on heterogeneous storage systems. More specifically, such a programming model supports the following features: (i) abstracting the heterogeneity of the storage devices and exposing them as a single hierarchical storage resource; (ii) supporting dedicated I/O scheduling; and (iii) a mechanism that automatically and periodically flushes obsolete data from higher storage layers to lower storage layers.

Third, targeting higher levels of application parallelism, we propose a Hybrid Programming Model that combines task-based programming models and MPI. In this programming model, tasks are used to achieve coarse-grained parallelism on large-scale distributed infrastructures, whereas MPI is used to gain fine-grained parallelism by parallelizing task execution. Such a hybrid programming model makes it possible to use parallel I/O and high-level I/O libraries inside tasks. We enable this hybrid model by supporting Native MPI Tasks. These tasks are native to the programming model for two reasons: they execute task code rather than calling external MPI binaries or scripts, and their data transfers and input/output handling are completely transparent to application developers. As a result, parallelism levels increase while the design and programming of applications become easier.

Finally, to exploit the inherent parallelism opportunities in applications and overlap computation with I/O, we propose an Eager mechanism for releasing data dependencies.
Unlike the traditional approach to releasing dependencies, eagerly releasing data dependencies allows successor tasks to be released for execution as soon as their data dependencies are ready, without having to wait for the predecessor task(s) to completely finish execution. In order to support the eager release of data dependencies, we describe the following core modifications to the design of task-based programming models: (i) defining and managing data dependency relationships as parameter-aware dependencies, and (ii) a mechanism for notifying the programming model that an output datum has been generated before the execution of the producer task ends.
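
    The I/O Task idea from the first contribution can be sketched with the public PyCOMPSs API alone. In the sketch below, checkpoint() is an I/O-bound task whose execution the runtime can overlap with the following compute task; the thesis's I/O Awareness abstractions would additionally mark it as an I/O Task and throttle it under a Storage Bandwidth Constraint, but that marking is thesis-specific and is therefore only noted in a comment. This is a minimal illustration, not the thesis implementation.

        # Sketch only: compute() and checkpoint() are ordinary PyCOMPSs tasks.
        # The thesis's I/O Awareness abstractions would mark checkpoint() as an
        # I/O Task and apply a Storage Bandwidth Constraint; that API is not
        # shown here because it is specific to the thesis prototype.
        from pycompss.api.task import task
        from pycompss.api.parameter import FILE_OUT
        from pycompss.api.api import compss_barrier

        @task(returns=1)
        def compute(chunk):
            # CPU-bound work (placeholder)
            return sum(x * x for x in chunk)

        @task(out_file=FILE_OUT)
        def checkpoint(data, out_file):
            # I/O-bound work: persist intermediate data to storage
            with open(out_file, "w") as f:
                f.write(repr(data))

        if __name__ == "__main__":
            partial = compute(list(range(1_000_000)))
            checkpoint(partial, "step0.dat")                       # I/O task
            nxt = compute(list(range(1_000_000, 2_000_000)))       # independent compute task
            compss_barrier()   # the runtime can overlap checkpoint() with the second compute()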

    Enhanced performance using hybrid programming models of task-based workflows and MPI

    While MPI [1] + X (where X is another parallel programming model) has been proposed and used by the community, we propose a hybrid programming model that combines a task-based model + MPI. Task-based workflows offer the necessary abstraction to simplify application development for large-scale execution, and supporting tasks that launch MPI executions makes it possible to exploit the performance capabilities of manycore systems. Hence, application programmers can get the maximum performance out of the underlying systems without compromising the programmability of the application. We present an extension to the PyCOMPSs framework [2], a task-based parallel programming model for the execution of Python applications. Throughout this paper, we refer to tasks that natively execute MPI code as Native MPI Tasks, as opposed to tasks that call external MPI binaries. Having Native MPI Tasks as part of the programming model means that, in the same source file, users can have two types of tasks: tasks that execute MPI code and tasks that execute non-MPI code. PyCOMPSs organizes the tasks in a Directed Acyclic Graph (DAG) and manages their scheduling and execution, hence users can focus only on the logic of the tasks.
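
    A minimal sketch of a Native MPI Task expressed with PyCOMPSs decorators is shown below. The @mpi arguments (runner, processes) follow recent PyCOMPSs releases and may differ across versions, and how per-rank return values are collected is version-dependent; the task body itself uses standard mpi4py calls.

        # Hedged sketch of a Native MPI Task: the decorated Python function is
        # launched as 4 MPI ranks; data transfers are handled by the runtime.
        from pycompss.api.mpi import mpi
        from pycompss.api.task import task
        from pycompss.api.api import compss_wait_on

        @mpi(runner="mpirun", processes=4)   # argument names assumed from recent releases
        @task(returns=1)
        def partial_sums(values):
            from mpi4py import MPI
            comm = MPI.COMM_WORLD
            rank, size = comm.Get_rank(), comm.Get_size()
            local = sum(values[rank::size])              # each rank reduces its slice
            return comm.reduce(local, op=MPI.SUM, root=0)

        if __name__ == "__main__":
            result = compss_wait_on(partial_sums(list(range(10_000))))
            # Depending on the PyCOMPSs version, `result` may be the root's value
            # or a list with one entry per rank (assumption, check the docs).
            print(result)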

    Storage-heterogeneity aware task-based programming models to optimize I/O intensive applications

    Task-based programming models have enabled the optimized execution of the computation workloads of applications. These programming models can take advantage of large-scale distributed infrastructures by allowing the parallel and distributed execution of applications in high-level work components called tasks. Nevertheless, in the era of Big Data and Exascale, the amount of data produced by modern scientific applications has already surpassed terabytes and is rapidly increasing. Hence, I/O performance has become the bottleneck to overcome in order to achieve further improvements in total performance. New storage technologies offer higher bandwidth and faster solutions than traditional Parallel File Systems (PFS). Such storage devices are deployed in modern-day infrastructures to boost I/O performance by offering a fast layer that absorbs the generated data. Therefore, any programming model targeting higher performance must manage this heterogeneity and take advantage of it to improve the I/O performance of applications. Towards this goal, we propose in this paper a set of programming model capabilities that we refer to as Storage-Heterogeneity Awareness. Such capabilities include: (i) abstracting the heterogeneity of storage systems, and (ii) optimizing I/O performance by supporting dedicated I/O schedulers and an automatic data flushing technique. The evaluation section of this paper presents the performance results of different applications on the MareNostrum CTE-Power heterogeneous storage cluster. Our experiments demonstrate that a storage-heterogeneity aware programming model can achieve up to almost 5x I/O performance speedup and a 48% total time improvement compared to the reference PFS-based usage of the execution infrastructure. This work is partially supported by the European Union through the Horizon 2020 research and innovation programme under contract 721865 (EXPERTISE Project), by the Spanish Government (PID2019-107255GB), and by the Generalitat de Catalunya (contract 2014-SGR-1051).
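
    The automatic data flushing technique lives inside the runtime; the standalone sketch below only illustrates the policy the abstract describes: periodically move data that has not been accessed recently from the fast upper storage layer to the lower PFS layer. The paths and the age threshold are made up for illustration.

        import os
        import shutil
        import time

        FAST_LAYER = "/nvme/scratch/job"   # hypothetical upper (fast) storage layer
        PFS_LAYER = "/gpfs/projects/job"   # hypothetical lower (PFS) storage layer
        MAX_AGE_S = 300                    # flush files idle for more than 5 minutes

        def flush_obsolete(fast_dir=FAST_LAYER, pfs_dir=PFS_LAYER, max_age=MAX_AGE_S):
            """Move files not accessed within `max_age` seconds down to the PFS layer."""
            now = time.time()
            os.makedirs(pfs_dir, exist_ok=True)
            for name in os.listdir(fast_dir):
                src = os.path.join(fast_dir, name)
                if os.path.isfile(src) and now - os.path.getatime(src) > max_age:
                    shutil.move(src, os.path.join(pfs_dir, name))

        # A runtime would invoke flush_obsolete() periodically (e.g. from a
        # background thread) so the fast layer keeps room to absorb new data.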

    Optimizing execution on large-scale infrastructures by integrating task-based workflows and MPI

    While MPI [1] + X (where X is another parallel programming model) has been proposed and used by the community, we propose a hybrid programming model that combines a task-based model + MPI. Task-based workflows offer the necessary abstraction to simplify application development for large-scale execution, and supporting tasks that launch MPI executions makes it possible to exploit the performance capabilities of manycore systems. Hence, application programmers can get the maximum performance out of the underlying systems without compromising the programmability of the application. We present an extension to the PyCOMPSs framework [2], a task-based parallel programming model for the execution of Python applications. Throughout this paper, we refer to tasks that natively execute MPI code as Native MPI Tasks, as opposed to tasks that call external MPI binaries. Having Native MPI Tasks as part of the programming model means that, in the same source file, users can have two types of tasks: tasks that execute MPI code and tasks that execute non-MPI code. PyCOMPSs organizes the tasks in a Directed Acyclic Graph (DAG) and manages their scheduling and execution, hence users can focus only on the logic of the tasks.
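
    The closing point, that PyCOMPSs builds the task DAG from data flow so users only write task logic, can be seen in a few lines using only the standard @task decorator and compss_wait_on; the example itself is illustrative.

        # The runtime infers the DAG from data flow: the two preprocess() calls
        # are independent and can run in parallel, combine() waits for both.
        from pycompss.api.task import task
        from pycompss.api.api import compss_wait_on

        @task(returns=1)
        def preprocess(block):
            return [x * 2 for x in block]

        @task(returns=1)
        def combine(a, b):
            return a + b

        if __name__ == "__main__":
            left = preprocess(list(range(100)))
            right = preprocess(list(range(100, 200)))
            merged = combine(left, right)             # depends on both tasks above
            print(len(compss_wait_on(merged)))        # synchronize and fetch the result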

    Performance meets programmability: Enabling native Python MPI tasks in PyCOMPSs

    ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The increasing complexity of modern and future computing systems makes it challenging to develop applications that aim for maximum performance. Hybrid parallel programming models offer new ways to exploit the capabilities of the underlying infrastructure. However, the performance gain is sometimes accompanied by increased programming complexity. We introduce an extension to PyCOMPSs, a high-level task-based parallel programming model for Python applications, to support tasks that use MPI natively as part of the task model. Without compromising the application's programmability, using Native MPI Tasks in PyCOMPSs offers up to a 3x improvement in total performance for compute-intensive applications and up to a 1.9x improvement in total performance for I/O-intensive applications over a sequential implementation of the tasks. This work is partially supported by the European Union through the Horizon 2020 research and innovation programme under contracts 721865 (EXPERTISE Project) and 800898 (ExaQUte Project), by the Spanish Government (TIN2015-65316-P), and by the Generalitat de Catalunya (contract 2014-SGR-1051).
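
    As a hedged illustration of the kind of parallel I/O that Native MPI Tasks enable inside a task, the sketch below performs a collective MPI-IO write from within an MPI-decorated task. The decorator arguments follow recent PyCOMPSs releases, the file name is made up, and a real workflow would declare the output file as a task parameter so the runtime tracks it.

        # Hedged sketch: collective MPI-IO from inside an MPI-decorated task.
        # Decorator arguments assumed from recent PyCOMPSs releases; the output
        # file is hard-coded for brevity instead of being a FILE_OUT parameter.
        import numpy as np
        from pycompss.api.mpi import mpi
        from pycompss.api.task import task

        @mpi(runner="mpirun", processes=4)
        @task()
        def write_blocks(n_per_rank):
            from mpi4py import MPI
            comm = MPI.COMM_WORLD
            rank = comm.Get_rank()
            data = np.full(n_per_rank, rank, dtype=np.float64)
            fh = MPI.File.Open(comm, "blocks.bin",
                               MPI.MODE_CREATE | MPI.MODE_WRONLY)
            # Each rank writes its block at a non-overlapping byte offset.
            fh.Write_at_all(rank * data.nbytes, data)
            fh.Close()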

    Accelerated execution via eager-release of dependencies in task-based workflows

    Task-based programming models offer a flexible way to express the unstructured parallelism patterns of today's complex applications. This expressive capability is required to achieve the maximum possible performance for applications executed on distributed execution platforms. In current task-based workflows, tasks are launched for execution when their data dependencies are satisfied. However, even though the data dependencies of a certain task might have already been produced, the execution of this task is delayed until its predecessor tasks completely finish their execution. As a consequence of this approach to releasing dependencies, the amount of parallelism inherent in applications is limited and performance improvement opportunities are wasted. To mitigate this limitation, we propose an eager approach for releasing data dependencies. Following this approach, the execution of tasks is not delayed until their predecessor tasks completely finish execution; instead, tasks are launched for execution as soon as their data requirements are available. Hence, more parallelism is exposed and applications can achieve higher levels of performance by overlapping the execution of tasks. Towards achieving this goal, in this paper we propose two changes to task-based workflow systems. First, specifying the dependency relationships of tasks not only in terms of predecessor and successor tasks but also in terms of the data that caused these dependencies. Second, triggering the release of dependencies as soon as a predecessor task generates the output data, instead of having to wait until the end of the predecessor's execution to release all of its dependencies. We realize this proposal using PyCOMPSs, a task-based programming model for parallelizing Python applications. Our experiments show that using an eager approach for releasing dependencies achieves more than 50% improvement in total execution time compared to the default approach of releasing dependencies. This work is partially supported by the European Union through the Horizon 2020 research and innovation programme under contract 721865 (EXPERTISE Project), by the Spanish Government (SEV2015-0493, TIN2015-65316-P), and by the Generalitat de Catalunya (contract 2014-SGR-1051).
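
    The eager-release mechanism is internal to the workflow runtime; the toy tracker below only illustrates the difference the abstract describes, namely waiting on (producer task, parameter) pairs instead of on whole predecessor tasks. All names are invented for illustration and do not correspond to PyCOMPSs internals.

        from collections import defaultdict

        class EagerDependencyTracker:
            """Toy illustration: successors wait on (producer, parameter) pairs,
            so they can be released as soon as the specific output they need is
            generated, even if the producer task is still running."""

            def __init__(self):
                self.waiting = defaultdict(set)   # successor -> pending (task, param) pairs
                self.ready = []                   # successors whose inputs are all available

            def add_dependency(self, successor, producer, param):
                self.waiting[successor].add((producer, param))

            def output_generated(self, producer, param):
                # Called when `producer` has written `param`, possibly long
                # before the producer task itself finishes executing.
                for successor, pending in list(self.waiting.items()):
                    pending.discard((producer, param))
                    if not pending:
                        del self.waiting[successor]
                        self.ready.append(successor)   # eager release

        # t2 needs only output "a" of t1, so it becomes ready as soon as t1
        # notifies that "a" exists, not when t1 finishes.
        tracker = EagerDependencyTracker()
        tracker.add_dependency("t2", "t1", "a")
        tracker.output_generated("t1", "a")
        assert tracker.ready == ["t2"]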